install.packages('dplyr')
install.packages('plotly')
library(plotly)
Loading required package: ggplot2
Attaching package: ‘plotly’
The following object is masked from ‘package:ggplot2’:
last_plot
The following object is masked from ‘package:stats’:
filter
The following object is masked from ‘package:graphics’:
layout
library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
wines_dt <- read.csv2("wines.csv", row.names=1)
wines_dt.white_wines <- wines_dt %>% filter(Vinho == 'WHITE')
summary(wines_dt.white_wines)
 fixedacidity     volatileacidity  citricacid       residualsugar    chlorides
 Min.   : 3.800   Min.   :0.0800   Min.   :0.0000   Min.   : 0.600   Min.   :0.00900
 1st Qu.: 6.300   1st Qu.:0.2100   1st Qu.:0.2700   1st Qu.: 1.700   1st Qu.:0.03600
 Median : 6.800   Median :0.2600   Median :0.3200   Median : 5.200   Median :0.04300
 Mean   : 6.855   Mean   :0.2782   Mean   :0.3342   Mean   : 6.387   Mean   :0.04577
 3rd Qu.: 7.300   3rd Qu.:0.3200   3rd Qu.:0.3900   3rd Qu.: 9.900   3rd Qu.:0.05000
 Max.   :14.200   Max.   :1.1000   Max.   :1.6600   Max.   :45.800   Max.   :0.34600
 freesulfurdioxide  totalsulfurdioxide  density          pH              sulphates
 Min.   :  2.00     Min.   :  9.0       Min.   :0.9871   Min.   :2.720   Min.   :0.2200
 1st Qu.: 23.00     1st Qu.:108.0       1st Qu.:0.9917   1st Qu.:3.090   1st Qu.:0.4100
 Median : 34.00     Median :134.0       Median :0.9937   Median :3.180   Median :0.4700
 Mean   : 35.31     Mean   :138.4       Mean   :0.9940   Mean   :3.188   Mean   :0.4898
 3rd Qu.: 46.00     3rd Qu.:167.0       3rd Qu.:0.9961   3rd Qu.:3.280   3rd Qu.:0.5500
 Max.   :289.00     Max.   :440.0       Max.   :1.0140   Max.   :3.820   Max.   :1.0800
 alcohol         quality         Vinho
 Min.   : 8.00   Min.   :3.000   RED  :   0
 1st Qu.: 9.50   1st Qu.:5.000   WHITE:4898
 Median :10.40   Median :6.000
 Mean   :10.51   Mean   :5.878
 3rd Qu.:11.40   3rd Qu.:6.000
 Max.   :14.20   Max.   :9.000
We now continue the analysis using only the white wines. Since most of the exploratory analysis was already done in the other markdowns, we will cover only what remains: outliers, PCA, and variance.
Once that analysis is complete, we can finally apply predictive models.
We need to understand the outliers, because they can directly affect our predictions later. Taking a univariate approach for the continuous variables, we will flag values lying more than 1.5 times the interquartile range (the spread of the middle 50% of the data, between the 25th and 75th percentiles) beyond the quartiles of each field.
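The 1.5 × IQR rule can be sketched as a small helper. This is only an illustration, not the notebook's own code: `iqr_fences` and the toy vector `x` are hypothetical names, and the quartiles use R's default `quantile()` method rather than `type = 2`.

```r
# Hypothetical helper: fences at k times the interquartile range
# below the first quartile and above the third quartile.
iqr_fences <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  c(lower = q[1] - k * iqr, upper = q[2] + k * iqr)
}

# Toy data (not taken from wines.csv): 100 sits far above the rest.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
fences <- iqr_fences(x)

# Values outside the fences are the candidates a box plot draws as points.
outliers <- x[x < fences["lower"] | x > fences["upper"]]
print(fences)
print(outliers)
```

The same helper could be applied to any continuous column, for example `iqr_fences(wines_dt.white_wines$residualsugar)`, assuming the data frame is loaded as above.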
# plotly box traces have no 'main' attribute, so the title is set through layout()
plot_ly(y = wines_dt.white_wines$fixedacidity, type = 'box', name = 'Fixed acidity') %>% layout(title = 'fixedacidity')
plot_ly(y = wines_dt.white_wines$volatileacidity, type = 'box', name = 'Volatile acidity') %>% layout(title = 'volatileacidity')
plot_ly(y = wines_dt.white_wines$citricacid, type = 'box', name = 'Citric acid') %>% layout(title = 'citricacid')
plot_ly(y = wines_dt.white_wines$residualsugar, type = 'box', name = 'Residual sugar') %>% layout(title = 'residualsugar')
plot_ly(y = wines_dt.white_wines$chlorides, type = 'box', name = 'Chlorides') %>% layout(title = 'chlorides')
plot_ly(y = wines_dt.white_wines$freesulfurdioxide, type = 'box', name = 'Free sulfur dioxide') %>% layout(title = 'freesulfurdioxide')
plot_ly(y = wines_dt.white_wines$totalsulfurdioxide, type = 'box', name = 'Total sulfur dioxide') %>% layout(title = 'totalsulfurdioxide')
outliers_residual_sugar <-quantile(wines_dt.white_wines$residualsugar,.75,type=2)-quantile(wines_dt.white_wines$residualsugar,.25,type=2)
outliers_residual_sugar
75%
8.2
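As a worked example, the fences for `residualsugar` follow from the quartiles reported by `summary()` (1.7 and 9.9) and the interquartile range of 8.2 computed above. The variable names below are illustrative, and the numbers are copied from the printed output rather than recomputed from `wines.csv`.

```r
# Quartiles of residualsugar as printed by summary() (assumed inputs).
q1 <- 1.7
q3 <- 9.9

iqr <- q3 - q1                  # interquartile range, about 8.2
upper_fence <- q3 + 1.5 * iqr   # about 22.2
lower_fence <- q1 - 1.5 * iqr   # about -10.6

print(c(lower = lower_fence, upper = upper_fence))
```

Because the minimum residual sugar is 0.6, the lower fence cannot flag any point here; the maximum of 45.8 lies well above the upper fence, so at least that observation counts as an outlier under this rule.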
plot_ly(y = wines_dt.white_wines$density, type = 'box', name = 'Density') %>% layout(title = 'density')
plot_ly(y = wines_dt.white_wines$pH, type = 'box', name = 'pH') %>% layout(title = 'pH')
plot_ly(y = wines_dt.white_wines$sulphates, type = 'box', name = 'Sulphates') %>% layout(title = 'sulphates')
plot_ly(y = wines_dt.white_wines$alcohol, type = 'box', name = 'Alcohol') %>% layout(title = 'alcohol')